HLJIT at TREC 2017 Real-Time Summarization
Authors
Abstract
This paper describes the approaches used in the TREC 2017 Real-Time Summarization track. The task comprises two scenarios: push notifications and email digests. For the push notification scenario, three filtering models are proposed to select the tweets relevant to a given topic: one based on the hyperlink-extended retrieval model, one based on learning to rank, and a hybrid filtering model. A novelty verification method then further filters the tweets before they are pushed. For the email digest scenario, three ranking models are presented to rank the relevant tweets: the hyperlink-extended retrieval model, the retrieval model based on learning to rank, and the personal retrieval model. Similarly, a novelty verification step is proposed to filter out redundant tweets. The evaluation results of the TREC 2017 Real-Time Summarization track show that the performance of our models is competitive.
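The two-stage structure described in the abstract (a relevance filter followed by novelty verification) can be sketched as below. This is a minimal illustration, not the authors' method: the term-overlap relevance score and the Jaccard similarity test are stand-ins for the paper's retrieval and learning-to-rank models, and the function names and both thresholds are assumptions chosen for the example.

```python
def tokenize(text):
    """Lowercase whitespace tokenization (a simplifying assumption)."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def push_filter(stream, topic_terms, rel_threshold=0.2, nov_threshold=0.6):
    """Keep tweets that are relevant to the topic and not redundant.

    Relevance here is the fraction of topic terms found in the tweet,
    a placeholder for the retrieval models named in the abstract.
    Both thresholds are hypothetical values.
    """
    topic = {t.lower() for t in topic_terms}
    pushed = []         # tweets selected so far
    pushed_tokens = []  # their token sets, used for novelty checks
    for tweet in stream:
        tokens = tokenize(tweet)
        relevance = len(tokens & topic) / len(topic) if topic else 0.0
        if relevance < rel_threshold:
            continue  # stage 1: not relevant enough
        if any(jaccard(tokens, prev) >= nov_threshold for prev in pushed_tokens):
            continue  # stage 2: too similar to an already pushed tweet
        pushed.append(tweet)
        pushed_tokens.append(tokens)
    return pushed
```

With a topic of `["earthquake", "aftershock"]`, a near-duplicate of an already pushed tweet is dropped by the novelty check, while an off-topic tweet is dropped by the relevance check.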
Similar resources
HLJIT at TREC 2016: The Approaches Based on Document Language Model for Real-Time Summarization Track
The paper describes the HLJIT approach to the TREC 2016 Real-Time Summarization Track for microblogs. Three summarization approaches are proposed under the language model framework: the traditional language model, the temporal document language model, and the hyperlink-extended language model.
NOVASearch at TREC 2017 Real-Time Summarization Track
The rise of large data streams introduces new challenges regarding the delivery of relevant content towards an information need. This information need can be seen as a broad topic of information. One possible strategy to tackle the delivery of the most relevant documents regarding this broader topic is summarization. TREC 2017 Real-Time Summarization (RTS) provides a testbed for the development...
S.T at TREC 2017: Real-Time Summarization Track
This paper presents the participation of Shanghai Normal University in the TREC 2017 Real-Time Summarization (RTS) Track. We adopted three different composite methods that apply various factors, i.e., count, cosine and distance, to measure the relevance between a tweet and a given topic. By setting a static relevance threshold for each run, we selected the most relevant but non-redundant tweets and th...
CLIP at TREC 2016: LiveQA and RTS
The Computational Linguistics and Information Processing lab at the University of Maryland participated in two TREC tracks this year. The LiveQA and the Real-Time Summarization tasks both involve information processing in real time. We submitted eight runs in total. In both tasks, our best system had the highest precision among all automatic participating systems. This paper describes the a...